Current Issue: October-December | Volume: 2022 | Issue: 4 | Articles: 5
In multi-target detection, small and occluded targets are difficult to detect in complex traffic scenes. To this end, an improved YOLOv4 detection method is proposed in this work. First, the network structure of the original YOLOv4 is adjusted: the 4× downsampling feature map of the backbone network is introduced into the neck network of the YOLOv4 model and spliced with the 8× downsampling feature map to form a four-scale detection structure, which enhances the fusion of deep and shallow semantic information in the feature maps and improves the detection accuracy for small targets. Then, the convolutional block attention module (CBAM) is added to the neck network to strengthen feature learning across both spatial and channel dimensions. Lastly, the detection rate for occluded targets is improved by using the soft non-maximum suppression (Soft-NMS) algorithm based on the distance intersection over union (DIoU), which decays the scores of overlapping bounding boxes instead of deleting them outright. Experimental evaluation on the KITTI dataset demonstrates that the proposed model effectively improves multi-target detection accuracy: the mean average precision (mAP) of the improved YOLOv4 reaches 81.23%, which is 3.18% higher than the original YOLOv4, and the computation speed reaches 47.32 FPS. Compared with existing popular detection models, the proposed model achieves both higher detection accuracy and higher computation speed.
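The DIoU-based Soft-NMS step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the Gaussian decay, `sigma`, and the score threshold are illustrative assumptions.

```python
import numpy as np

def diou(box, boxes):
    """DIoU between one box and an array of boxes, format [x1, y1, x2, y2]:
    IoU minus the normalized squared distance between box centers."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area_a + area_b - inter)
    # squared distance between box centers
    cd = ((box[0] + box[2]) / 2 - (boxes[:, 0] + boxes[:, 2]) / 2) ** 2 \
       + ((box[1] + box[3]) / 2 - (boxes[:, 1] + boxes[:, 3]) / 2) ** 2
    # squared diagonal of the smallest enclosing box
    ex1 = np.minimum(box[0], boxes[:, 0]); ey1 = np.minimum(box[1], boxes[:, 1])
    ex2 = np.maximum(box[2], boxes[:, 2]); ey2 = np.maximum(box[3], boxes[:, 3])
    diag = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return iou - cd / np.maximum(diag, 1e-9)

def soft_nms_diou(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Soft-NMS: instead of deleting boxes that overlap the current best
    detection, decay their scores by a Gaussian of the DIoU."""
    boxes, scores = boxes.astype(float).copy(), scores.astype(float).copy()
    keep, idxs = [], list(range(len(boxes)))
    while idxs:
        best = max(idxs, key=lambda i: scores[i])
        keep.append(best)
        idxs.remove(best)
        if not idxs:
            break
        rest = np.array(idxs)
        d = diou(boxes[best], boxes[rest])
        scores[rest] *= np.exp(-np.clip(d, 0, None) ** 2 / sigma)
        idxs = [i for i in rest if scores[i] > score_thresh]
    return keep
```

Because scores are decayed rather than zeroed, a heavily occluded target whose box overlaps a stronger detection can survive with a reduced score instead of being suppressed outright.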
Automatic garment size measurement approaches using computer vision algorithms have been attempted in various ways, but many limitations remain. One limitation is that the process relies on 2D images, which constrains the determination of the actual distance between the estimated points. To solve this problem, in this paper we propose an automated method for measuring garment sizes using computer vision deep learning models and point cloud data. In the proposed method, a deep learning-based keypoint estimation model first captures the clothing size measurement points from 2D images. Then, point cloud data from a LiDAR sensor provide real-world distance information to calculate the actual clothing sizes. As the proposed method uses a mobile device equipped with a LiDAR sensor and camera, it is also more easily configurable than extant methods, which have varied constraints. Experimental results show that our method is not only precise but also robust in measuring size regardless of the shape, direction, or design of the clothes, achieving average relative errors of 1.59% and 2.08% in two different environments, respectively.
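The core idea of combining 2D keypoints with point cloud data can be sketched as below. This assumes the point cloud has been registered to the image so that each pixel maps to a 3D position in metres; the function names are illustrative, not from the paper.

```python
import numpy as np

def measure_size(kp_a, kp_b, point_cloud):
    """Real-world distance between two estimated keypoints.

    kp_a, kp_b: (row, col) pixel coordinates of the measurement points
    predicted by the keypoint estimation model.
    point_cloud: H x W x 3 array mapping each pixel to an (x, y, z)
    position in metres (e.g. back-projected from a LiDAR depth map).
    """
    p_a = point_cloud[kp_a[0], kp_a[1]]
    p_b = point_cloud[kp_b[0], kp_b[1]]
    return float(np.linalg.norm(p_a - p_b))

def relative_error(measured, ground_truth):
    """Relative measurement error in percent, as used in the evaluation."""
    return abs(measured - ground_truth) / ground_truth * 100.0
```

The key point is that the distance is computed in 3D world coordinates, so it is unaffected by the scale ambiguity that limits purely 2D approaches.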
Human action recognition models based on spatial-temporal graph convolutional neural networks have gradually developed, and we present an improved spatial-temporal graph convolutional neural network to address the high parameter counts and low accuracy of this type of model. The method mainly draws on the inception structure. First, tensor rotation is added to the graph convolution layer to convert between the graph-node dimension and the channel dimension, enhancing the model's ability to capture global information in small-scale graph tasks. Then an inception temporal convolution layer is added to build a multiscale temporal convolution filter that hierarchically perceives temporal information across four time scales. This overcomes the shortcomings of temporal graph convolutional networks in modelling joint relevance in hidden layers and compensates for the information omission of small-scale graph tasks; it also limits the parameter volume, reduces the required computation, and speeds up inference. In our experiments, we verify our model on the public NTU RGB+D dataset. Our method reduces the number of model parameters by 50% and achieves an accuracy of 90% under the cross-subject (CS) evaluation protocol and 94% under the cross-view (CV) protocol. The results show that our method not only has high recognition accuracy and good robustness in human behavior recognition applications but also has a small number of model parameters, which effectively reduces the computational cost.
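The tensor rotation can be illustrated on the usual skeleton feature layout (batch N, channels C, frames T, joints V). This is a minimal sketch of the axis conversion, not the paper's full layer; the shape convention is the common one for such models and is an assumption here.

```python
import numpy as np

def rotate(x):
    """Swap the channel and graph-node axes of an (N, C, T, V) tensor,
    so that a subsequent convolution mixes information across the V
    joints instead of across channels. Applying it twice restores the
    original layout, since it simply exchanges axes 1 and 3."""
    return np.transpose(x, (0, 3, 2, 1))  # (N, C, T, V) -> (N, V, T, C)
```

After rotation, an ordinary channel-wise operation acts on the joint dimension, which is how the conversion between graph-node and channel dimensions lets the model aggregate global skeleton information.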
Due to the increasing number of urban vehicles and the irregular driving behavior of drivers, urban accidents occur frequently, causing serious casualties and economic losses. Active vehicle safety systems can monitor vehicle status and driver status online in real time. Computer vision technology simulates biological vision and can analyze, identify, detect, and track the data and information in captured images. For driving accident warning and vehicle status warning, an active safety system can enhance the driver's ability to detect abnormal situations, extend the time available to respond, and reduce the risk of safety accidents. In this paper, an active safety system is developed within the existing vehicle electronic system framework, and the early-warning decision is made by evaluating the relationship between the minimum early-warning distance and the actual inter-vehicle distance, speed, and other factors. The kinematic model underlying the vehicle active safety early-warning system is designed. The results show that, with a driver judgment time of 400 ms, for drivers with reaction times of 0.6 s and 0.9 s, a following distance of 20 m does not constitute a safety threat and no braking operation is required.
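The warning decision can be sketched with a simple constant-deceleration kinematic model. This is an assumed form, not the paper's exact model: the speed, maximum deceleration, and standstill margin below are illustrative values.

```python
def min_warning_distance(v, t_judge, t_react, a_max, d_margin=2.0):
    """Minimum early-warning distance under a constant-deceleration model:
    distance covered at speed v (m/s) during the judgment and reaction
    times (s), plus the braking distance v^2 / (2 * a_max), plus a
    standstill safety margin in metres (all values illustrative)."""
    return v * (t_judge + t_react) + v * v / (2.0 * a_max) + d_margin

def needs_braking(actual_gap, v, t_judge, t_react, a_max):
    """Warn and brake only when the actual inter-vehicle gap is shorter
    than the computed minimum early-warning distance."""
    return actual_gap < min_warning_distance(v, t_judge, t_react, a_max)
```

For example, at an assumed closing speed of 5 m/s with a 0.4 s judgment time, a 0.9 s reaction time, and 7 m/s² peak deceleration, the minimum warning distance is about 10.3 m, so a 20 m gap triggers no braking, consistent with the finding quoted above.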
Bridge inspection plays a critical role in mitigating the safety risks associated with bridge deterioration and decay. Computer vision (CV) technology can facilitate bridge inspection by accurately automating structural recognition tasks, which is especially useful in inspections assisted by unmanned aerial vehicles (UAVs). This study proposes a framework for the multilevel inspection of bridges based on CV technology and verifies it using convolutional neural network (CNN) models. On long-distance images, bridge-type recognition was performed using the ResNet50 network. The dataset was built from 1200 internet-captured images of arched bridges, cable-stayed bridges, and suspension bridges, and the network was trained and evaluated, obtaining a classification accuracy of 96.29%. The YOLOv3 model was used to recognize bridge components in medium-distance images. A dataset was created from 300 internet-collected images of girders and piers, and image augmentation techniques and model hyperparameter tuning were investigated, yielding a detection accuracy of 93.55% for girders and 82.64% for piers. For close-distance images, segmentation and recognition of bridge components were investigated using the instance segmentation algorithm of the Mask R-CNN model. A dataset of 800 images of girders and bearings was created and annotated based on Yokohama City bridge inspection image records. The trained model achieved an accuracy of 90.8% for bounding boxes and 87.17% for segmentation. This study also contributes to research on bridge image acquisition, computer vision model comparison, hyperparameter tuning, and optimization techniques.
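The multilevel structure of the framework can be summarized as a simple dispatch by capture distance. The thresholds below are illustrative assumptions, not values from the study.

```python
def select_inspection_model(distance_m, far_thresh=50.0, near_thresh=5.0):
    """Route a bridge image to the matching recognition task by capture
    distance (thresholds are illustrative, not from the study):
    long range   -> bridge-type classification (ResNet50),
    medium range -> component detection (YOLOv3),
    close range  -> component instance segmentation (Mask R-CNN)."""
    if distance_m >= far_thresh:
        return "bridge_type_classification"
    if distance_m > near_thresh:
        return "component_detection"
    return "component_segmentation"
```

This mirrors how the study pairs each image scale with the model best suited to it: whole-structure classification from afar, component detection at medium range, and pixel-level segmentation up close.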